Teaching Statistics with Quarto

Thomas J. Clark

Dordt University

Introduction to Quarto

Quarto

  • Integrates text and code seamlessly in one document

  • R is the underlying statistical programming language

  • Can also use other languages such as Python and Julia

  • Quarto builds on, expands, and improves R Markdown

  • Easily create slides, HTML documents, PDFs, and more, all using one framework

  • Integrate LaTeX naturally within the document

  • Works with RStudio
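
To make the bullets above concrete, here is a minimal sketch of what a Quarto source file can look like, written out from R so that it stays in one language. The file name, title, and contents are illustrative assumptions, not taken from the course materials. The YAML header picks the output format, the body mixes prose and inline LaTeX, and the fenced chunk holds R code that runs when the document is rendered.

```r
# Hypothetical example: build a tiny .qmd file line by line.
fence <- strrep("`", 3)  # three backticks, built here to keep this listing readable

qmd_lines <- c(
  "---",
  'title: "Critical values"',   # illustrative title
  "format: html",               # the output format is chosen here
  "---",
  "",
  "## A 95% critical value",
  "",
  "We need $z_{0.025}$, the value with $P(Z > z_{0.025}) = 0.025$.",
  "",
  paste0(fence, "{r}"),         # start of an executable R chunk
  "qnorm(0.975, mean = 0, sd = 1)",
  fence                         # end of the chunk
)

qmd_path <- file.path(tempdir(), "example.qmd")
writeLines(qmd_lines, qmd_path)

# Rendering requires a Quarto installation, so it is left commented out:
# quarto::quarto_render(qmd_path)
```

Changing `format: html` to `format: revealjs` or `format: pdf` in the header is how the same source would target slides or a PDF instead.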

Context For Why I’m Using Quarto

My Statistics Background

  • Transitioning from Math to Statistics/Data Science/Actuarial Science

  • Taught our Stat 320 - Mathematical Statistics course this past Spring

  • I’ve also taught our Introductory Statistics and Intermediate Statistics courses, but they use applets, not R

  • I plan to incorporate Quarto next time.

Course Textbook

Pedagogical Choices

In class I used Quarto documents, shared with students, to:

  • Organize and display notes/information

  • Set up Examples to work through together

  • Give “Activities” for students to work through in groups

  • Post the completed “class document” after class as a .qmd file and the compiled .html file for reference.

Homework/Projects/Exams

  • Assignment templates written in Quarto

  • Problem statements pre-written and spaced with room to work.

  • Data files and code snippets preloaded.

Examples

Exploratory Data Analysis

head(FlightDelays)
  ID Carrier FlightNo Destination DepartTime Day Month FlightLength Delay
1  1      UA      403         DEN      4-8am Fri   May          281    -1
2  2      UA      405         DEN     8-Noon Fri   May          277   102
3  3      UA      409         DEN      4-8pm Fri   May          279     4
4  4      UA      511         ORD     8-Noon Fri   May          158    -2
5  5      UA      667         ORD      4-8am Fri   May          143    -3
6  6      UA      669         ORD      4-8am Fri   May          150     0
  Delayed30
1        No
2       Yes
3        No
4        No
5        No
6        No

Exploratory Data Analysis

The mean only tells you so much; a boxplot is a great way to compare a quantitative variable across groups.

library(ggplot2)
ggplot(data = FlightDelays, aes(x = Delay, y = Carrier)) + geom_boxplot()
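
As a small illustration of why the mean alone can mislead, the sketch below compares the mean and median delay by destination. It uses only the six rows printed by `head(FlightDelays)` earlier, so the numbers describe those six flights and not the full data set; the data frame name `flights6` is made up for this example.

```r
# First six rows of FlightDelays, copied from the head() output shown earlier
flights6 <- data.frame(
  Destination = c("DEN", "DEN", "DEN", "ORD", "ORD", "ORD"),
  Delay = c(-1, 102, 4, -2, -3, 0)
)

# Mean and median delay for each destination
mean_delay <- tapply(flights6$Delay, flights6$Destination, mean)
median_delay <- tapply(flights6$Delay, flights6$Destination, median)

print(round(rbind(mean = mean_delay, median = median_delay), 2))
```

For the three Denver flights, the single 102-minute delay pulls the mean up to 35 minutes while the median is 4, which is the kind of difference a boxplot makes visible.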

Exploratory Data Analysis

What if we remove the outliers?

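# Note: xlim() drops flights outside [-20, 20] before the boxplot statistics are
# computed; coord_cartesian(xlim = c(-20, 20)) would zoom in without dropping data.
ggplot(data = FlightDelays, aes(x = Delay, y = Carrier)) +
  geom_boxplot(outlier.shape = NA) + xlim(-20, 20)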

Exploratory Data Analysis

Is there an association between flight length and delay?

ggplot(data = FlightDelays) +
  geom_point(aes(x = FlightLength, y = Delay, color = Carrier))
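
A numeric companion to the scatterplot is the sample correlation. The sketch below computes it on the six rows shown in the `head(FlightDelays)` output, purely to show the call; six flights are far too few to say anything about the association, and with the full data the same call would take `FlightDelays$FlightLength` and `FlightDelays$Delay`. The name `flights6` is made up for this example.

```r
# First six rows of FlightDelays, copied from the head() output shown earlier
flights6 <- data.frame(
  FlightLength = c(281, 277, 279, 158, 143, 150),
  Delay = c(-1, 102, 4, -2, -3, 0)
)

# Pearson correlation between flight length and delay (six rows only)
r <- cor(flights6$FlightLength, flights6$Delay)
print(r)
```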

Bayesian Analysis

Eleven urns contain red and blue marbles. One has 0 red and 10 blue, one has 1 red and 9 blue, and so on up to 10 red and 0 blue. One urn is chosen randomly. Marbles are drawn one at a time with replacement. The draws are, in order: R, R, B, B, B, B, B, B, B, B.
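
One way to analyze this setup is to compute the posterior probability of each urn given the ten draws. The sketch below assumes a uniform prior over the eleven urns (matching "chosen randomly") and treats the draws as independent, since they are made with replacement. The variable names are illustrative, not from the original slides.

```r
reds <- 0:10                 # number of red marbles in each urn
prior <- rep(1 / 11, 11)     # each urn equally likely to be chosen
p_red <- reds / 10           # chance of drawing red from each urn

n_red <- 2                   # observed draws: 2 red ...
n_blue <- 8                  # ... and 8 blue

# Likelihood of the observed sequence under each urn
likelihood <- p_red^n_red * (1 - p_red)^n_blue

# Bayes' rule: posterior is proportional to prior times likelihood
posterior <- prior * likelihood / sum(prior * likelihood)

print(data.frame(reds, posterior = round(posterior, 4)))
most_likely <- reds[which.max(posterior)]
print(most_likely)
```

The urns with 0 or 10 red marbles get posterior probability zero, because both colors were observed, and the urn with 2 red marbles has the highest posterior probability.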

100 Random Walks

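library(ggplot2)

dice <- 1:6
n <- 100      # number of random walks
L <- 200      # number of time points per walk
N <- L * n    # total number of rows (was 100 * n, which undercounts when L is 200)
ts <- 1:L
ts2 <- rep(ts, n)
ys2 <- numeric(N)
rounds <- character(N)
for (j in 1:n) {
  offset <- L * (j - 1)                  # where walk j starts in the long vectors
  ys2[1 + offset] <- 100                 # every walk starts at 100
  rounds[1 + offset] <- as.character(j)
  for (k in 2:L) {
    # Each step is the sum of two dice minus 7, so it has mean 0
    ys2[k + offset] <- ys2[k - 1 + offset] + sample(dice, 1) + sample(dice, 1) - 7
    rounds[k + offset] <- as.character(j)
  }
}
width <- 2.47 * 2   # envelope scale; renamed from c to avoid masking base::c()
envelope1 <- function(x) 100 + width * sqrt(x)
envelope2 <- function(x) 100 - width * sqrt(x)
df2 <- data.frame(ts2, ys2, rounds)
plot1 <- ggplot(data = df2, aes(x = ts2, y = ys2)) +
  geom_line(aes(color = rounds)) +
  stat_function(fun = envelope1, color = "blue") +
  stat_function(fun = envelope2, color = "blue")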

100 Random Walks

plot1
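
The square-root shape of the envelope comes from how the variance of a random walk grows. As a sketch of that reasoning, the code below enumerates all 36 equally likely outcomes of two dice and computes the exact mean and standard deviation of a single step (the sum minus 7). After t independent steps the standard deviation of the position is the single-step value times sqrt(t). Reading the slide's envelope as roughly a two-standard-deviation band is an assumption here; the slides do not say how the constant 2.47 was chosen.

```r
dice <- 1:6

# All 36 equally likely values of (die 1 + die 2 - 7)
steps <- as.vector(outer(dice, dice, "+")) - 7

step_mean <- mean(steps)                   # exact mean of one step
step_var <- mean((steps - step_mean)^2)    # exact (population) variance of one step
step_sd <- sqrt(step_var)

# Standard deviation of the walk's position after t steps
t <- 199
sd_after_t <- step_sd * sqrt(t)

print(c(mean = step_mean, variance = step_var, sd = step_sd))
print(sd_after_t)
```

The step has mean 0 and variance 35/6, so its standard deviation is about 2.415, close to the constant used in the envelope functions.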

LaTeX

If you want to present some information, but you also want to toss in mathematical formulas like \(\displaystyle T^* = \frac{\bar{X}^* - \bar{x}}{S^*/\sqrt{n}}\),

and also run a bit of code

qnorm(0.975, mean = 0, sd = 1)
[1] 1.959964

then Quarto is great.
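
The formula above is the bootstrap t statistic, and a chunk that computes it can sit right next to the typeset version. The sketch below is one minimal way to do that in base R. The sample `x` is simulated purely for illustration, and the number of resamples `B` is an arbitrary choice.

```r
set.seed(1)
x <- rexp(30, rate = 1)   # simulated sample, for illustration only
n <- length(x)
x_bar <- mean(x)          # original sample mean

B <- 1000                 # number of bootstrap resamples (arbitrary)
t_star <- replicate(B, {
  x_star <- sample(x, size = n, replace = TRUE)     # resample with replacement
  (mean(x_star) - x_bar) / (sd(x_star) / sqrt(n))   # T* for this resample
})

# Bootstrap quantiles of T*, which stand in for the usual t critical values
print(quantile(t_star, c(0.025, 0.975)))
```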

Memes

Lessons Learned

Should You Switch to Quarto?

  • Are you using R already?

  • Great for presenting information.

  • Great for integrating code.

  • Great for homework documents (easy PDFs).

  • Working an Example together, then doing an Activity in groups, works really well as a class structure.

Questions?

References

  1. Chihara LM, Hesterberg TC (2018). Mathematical Statistics with Resampling and R, 3rd edition. John Wiley & Sons, Hoboken, NJ. ISBN 978-1-119-41653-1

  2. https://quarto.org/